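Large language models (LLMs) have been transformative. They are pre-trained foundation models that can be adapted through fine-tuning to many different natural language tasks, each of which previously required a separate network model. This is one step closer to the extraordinary versatility of human language. GPT-3 and, more recently, LaMDA can carry on dialogues with humans on many topics after minimal priming with examples. However, reactions have varied widely on whether these LLMs understand what they are saying or exhibit signs of intelligence. This high variance shows up in three interviews with LLMs that reach wildly different conclusions. A new possibility was uncovered that could explain this divergence. What appears to be intelligence in LLMs may in fact be a mirror that reflects the intelligence of the interviewer, a remarkable twist that could be considered a reverse Turing test. If so, then by studying interviews we may be learning more about the intelligence and beliefs of the interviewer than about the intelligence of the LLMs.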
translated by 谷歌翻译
We present temporally layered architecture (TLA), a biologically inspired system for temporally adaptive distributed control. TLA layers a fast and a slow controller together to achieve temporal abstraction that allows each layer to focus on a different time-scale. Our design draws on the architecture of the human brain, which executes actions at different timescales depending on the environment's demands. Such distributed control design is widespread across biological systems because it increases survivability and accuracy in certain and uncertain environments. We demonstrate that TLA can provide many advantages over existing approaches, including persistent exploration, adaptive control, explainable temporal behavior, compute efficiency and distributed control. We present two different algorithms for training TLA: (a) closed-loop control, where the fast controller is trained over a pre-trained slow controller, which allows better exploration for the fast controller and lets it decide whether to "act-or-not" at each timestep; and (b) partially open-loop control, where the slow controller is trained over a pre-trained fast controller and either picks a temporally extended action or defers the next n actions to the fast controller. We evaluate our method on a suite of continuous control tasks and demonstrate the advantages of TLA over several strong baselines.
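The "act-or-not" idea in variant (a) can be illustrated with a toy rollout. This is only a minimal sketch under stated assumptions, not the TLA training algorithm: the controllers here are hand-written rather than learned, and the names `slow_controller`, `fast_controller`, `fast_gate`, `slow_period`, and `threshold` are all hypothetical. A slow controller refreshes a held action every few steps, and a fast gate decides at each timestep whether to override it.

```python
from dataclasses import dataclass


@dataclass
class StepRecord:
    t: int
    state: float
    action: float
    fast_acted: bool


def slow_controller(state: float) -> float:
    """Hypothetical slow policy: a coarse proportional push toward 0."""
    return -0.5 * state


def fast_controller(state: float) -> float:
    """Hypothetical fast policy: a finer correction toward 0."""
    return -0.9 * state


def fast_gate(state: float, held_action: float, threshold: float) -> bool:
    """Hypothetical act-or-not rule: intervene only if applying the held
    slow action would leave the state further than `threshold` from 0."""
    return abs(state + held_action) > threshold


def rollout(x0: float, steps: int = 12, slow_period: int = 4,
            threshold: float = 0.5) -> list:
    """Simulate x_{t+1} = x_t + a_t on a 1-D toy system with target 0."""
    state, held, log = x0, 0.0, []
    for t in range(steps):
        # The slow layer only refreshes its action every `slow_period` steps.
        if t % slow_period == 0:
            held = slow_controller(state)
        # The fast layer decides at every step whether to act or stay idle.
        acted = fast_gate(state, held, threshold)
        action = fast_controller(state) if acted else held
        state = state + action
        log.append(StepRecord(t, state, action, acted))
    return log


if __name__ == "__main__":
    for rec in rollout(4.0):
        print(rec)
```

In this toy run the fast layer intervenes only while the held action is badly out of date and stays idle once the state is near the target, which is the kind of compute saving the abstract attributes to temporal layering.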
The General Associative Memory Model (GAMM) has a constant state-dependent energy surface that leads the output dynamics to fixed points, retrieving single memories from a collection of memories that can be asynchronously preloaded. We introduce a new class of General Sequential Episodic Memory Models (GSEMM) that, in the adiabatic limit, exhibit a temporally changing energy surface, leading to a series of meta-stable states that are sequential episodic memories. The dynamic energy surface is enabled by newly introduced asymmetric synapses with signal propagation delays in the network's hidden layer. We study the theoretical and empirical properties of two memory models from the GSEMM class, differing in their activation functions. LISEM has non-linearities in the feature layer, whereas DSEM has non-linearity in the hidden layer. In principle, DSEM has a storage capacity that grows exponentially with the number of neurons in the network. We introduce a learning rule for the synapses based on the energy minimization principle and show it can learn single memories and their sequential relationships online. This rule is similar to the Hebbian learning algorithm and Spike-Timing Dependent Plasticity (STDP), which describe conditions under which synapses between neurons change strength. Thus, GSEMM combines the static and dynamic properties of episodic memory under a single theoretical framework and bridges neuroscience, machine learning, and artificial intelligence.
To date, little attention has been given to multi-view 3D human mesh estimation, despite its real-life applicability (e.g., motion capture, sport analysis) and robustness to single-view ambiguities. Existing solutions typically suffer from poor generalization performance to new settings, largely due to the limited diversity of image-mesh pairs in multi-view training data. To address this shortcoming, researchers have explored the use of synthetic images. However, besides the usual impact of the visual gap between rendered and target data, synthetic-data-driven multi-view estimators also suffer from overfitting to the camera viewpoint distribution sampled during training, which usually differs from real-world distributions. Tackling both challenges, we propose a novel simulation-based training pipeline for multi-view human mesh recovery, which (a) relies on intermediate 2D representations which are more robust to the synthetic-to-real domain gap; (b) leverages learnable calibration and triangulation to adapt to more diversified camera setups; and (c) progressively aggregates multi-view information in a canonical 3D space to remove ambiguities in 2D representations. Through extensive benchmarking, we demonstrate the superiority of the proposed solution, especially for unseen in-the-wild scenarios.
Multilevel Stein variational gradient descent is a method for particle-based variational inference that leverages hierarchies of approximations of target distributions with varying costs and fidelity to computationally speed up inference. This work provides a cost complexity analysis of multilevel Stein variational gradient descent that applies under milder conditions than previous results, especially in discrete-in-time regimes and beyond the limited settings where Stein variational gradient descent achieves exponentially fast convergence. The analysis shows that the convergence rate of Stein variational gradient descent enters only as a constant factor for the cost complexity of the multilevel version, which means that the costs of the multilevel version scale independently of the convergence rate of Stein variational gradient descent on a single level. Numerical experiments with Bayesian inverse problems of inferring discretized basal sliding coefficient fields of the Arolla glacier ice demonstrate that multilevel Stein variational gradient descent achieves orders of magnitude speedups compared to its single-level version.
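Understanding the 3D motion in a dynamic scene is essential to many vision applications. Recent progress has mainly focused on estimating the activity of specific elements such as humans. In this paper, we leverage a neural motion field to estimate the motion of all points in a multi-view setting. Modeling motion from a dynamic scene is challenging because of the ambiguities between points of similar color and points with time-varying color. We propose to regularize the estimated motion to be predictable. If the motion from previous frames is known, then the motion in the near future should be predictable. Therefore, we introduce a predictability regularization by first conditioning the estimated motion on latent embeddings and then adopting a predictor network to enforce predictability on the embeddings. The proposed framework, PREF (Predictability Regularized Fields), achieves on-par or better results than state-of-the-art dynamic scene representation methods based on neural motion fields, while requiring no prior knowledge of the scene.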
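Federated learning (FL) is a machine learning paradigm in which local nodes collaboratively train a central model while the training data remains decentralized. Existing FL methods typically share model parameters or employ co-distillation to address the problem of unbalanced data distribution. However, they suffer from communication bottlenecks. More importantly, they risk privacy leakage. In this work, we develop a privacy-preserving and communication-efficient method in an FL framework, using one-shot offline knowledge distillation with unlabeled, cross-domain public data. We propose a quantized and noisy ensemble of local predictions from fully trained local models to provide stronger privacy guarantees without sacrificing accuracy. Based on extensive experiments on image classification and text classification tasks, we show that our privacy-preserving method outperforms baseline FL algorithms in both accuracy and communication efficiency.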
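One way to picture a quantized and noisy ensemble of local predictions is a noisy majority vote over one-hot predictions. The sketch below is an assumption-laden illustration, not the paper's mechanism: quantizing to an argmax vote, using Laplace noise, and the names `quantize_to_votes`, `laplace_noise`, `noisy_ensemble_label`, and `noise_scale` are all choices made here for illustration.

```python
import random


def quantize_to_votes(local_probs):
    """Quantize each local model's probability vector to a one-hot vote
    (its argmax) and return the per-class vote counts."""
    n_classes = len(local_probs[0])
    votes = [0] * n_classes
    for probs in local_probs:
        votes[max(range(n_classes), key=probs.__getitem__)] += 1
    return votes


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials.
    A non-positive scale disables the noise (assumption for this sketch)."""
    if scale <= 0:
        return 0.0
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_ensemble_label(local_probs, noise_scale, rng):
    """Aggregate local predictions on one public sample into a single label:
    quantize, add noise to the vote counts, then take the argmax."""
    votes = quantize_to_votes(local_probs)
    noisy = [v + laplace_noise(noise_scale, rng) for v in votes]
    return max(range(len(noisy)), key=noisy.__getitem__)


if __name__ == "__main__":
    # Hypothetical predictions from three local models on one public sample.
    preds = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1], [0.1, 0.8, 0.1]]
    rng = random.Random(0)
    print(quantize_to_votes(preds))
    print(noisy_ensemble_label(preds, noise_scale=1.0, rng=rng))
```

Under these assumptions, the aggregated label for each unlabeled public sample would serve as the distillation target for the central model, so only compact predictions, rather than model parameters, are communicated.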
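Fully supervised human mesh recovery methods are data-hungry and generalize poorly because of the limited availability and diversity of 3D-annotated benchmark datasets. Using synthetic-data-driven training paradigms, recent work has made progress with models trained on synthetic pairs of 2D representations (e.g., 2D keypoints and segmentation masks) and 3D meshes. However, synthetic dense correspondence maps (i.e., IUV) have rarely been explored, because the domain gap between synthetic training data and real test data is hard to address for 2D dense representations. To alleviate this domain gap on IUV, we propose cross-representation alignment, which exploits complementary information from a robust but sparse representation (2D keypoints). Specifically, the alignment errors between the initial mesh estimate and both 2D representations are forwarded to the regressor and dynamically corrected in the subsequent mesh regression. This adaptive cross-representation alignment explicitly learns from the deviations and captures complementary information: robustness from the sparse representation and richness from the dense one. We conduct extensive experiments on multiple standard benchmark datasets and demonstrate competitive results, helping to reduce the annotation effort needed to produce state-of-the-art models in human mesh estimation.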
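This paper considers optimization proxies for Optimal Power Flow (OPF), i.e., machine-learning models that approximate the input/output relationship of OPF. Recent work has focused on showing that such proxies can be of high fidelity. However, their training requires significant data, and each instance requires the (offline) solving of an OPF for a sample of the input distribution. To meet the requirements of market-clearing applications, this paper proposes Active Bucketized Sampling (ABS), a novel active learning framework that aims at training the best possible OPF proxy within a time limit. ABS partitions the input distribution into buckets and uses an acquisition function to determine where to sample next. It relies on an adaptive learning rate that increases and decreases over time. Experimental results demonstrate the benefits of ABS.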
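The bucket-and-acquire loop described above can be sketched on a 1-D input range. This is a simplified illustration under assumptions, not the ABS algorithm itself: the acquisition function used here (mean validation error per bucket) and the names `make_buckets`, `bucket_index`, `acquisition_scores`, and `sample_next` are hypothetical, and the adaptive learning rate is not modeled.

```python
import random


def make_buckets(low, high, n_buckets):
    """Split [low, high] into equal-width buckets, returned as (lo, hi) pairs."""
    width = (high - low) / n_buckets
    return [(low + i * width, low + (i + 1) * width) for i in range(n_buckets)]


def bucket_index(x, low, high, n_buckets):
    """Map an input value to the index of the bucket that contains it."""
    i = int((x - low) / (high - low) * n_buckets)
    return min(max(i, 0), n_buckets - 1)


def acquisition_scores(val_inputs, val_errors, low, high, n_buckets):
    """Assumed acquisition function: mean validation error of the current
    proxy within each bucket (0.0 for buckets with no validation points)."""
    sums = [0.0] * n_buckets
    counts = [0] * n_buckets
    for x, err in zip(val_inputs, val_errors):
        i = bucket_index(x, low, high, n_buckets)
        sums[i] += err
        counts[i] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def sample_next(buckets, scores, rng):
    """Pick the highest-scoring bucket and draw a new input uniformly from it."""
    best = max(range(len(scores)), key=scores.__getitem__)
    lo, hi = buckets[best]
    return best, rng.uniform(lo, hi)


if __name__ == "__main__":
    rng = random.Random(0)
    buckets = make_buckets(0.0, 1.0, 4)
    # Hypothetical validation inputs and proxy errors.
    scores = acquisition_scores([0.1, 0.3, 0.6, 0.9],
                                [0.01, 0.02, 0.5, 0.03], 0.0, 1.0, 4)
    print(sample_next(buckets, scores, rng))
```

In a full loop, the sampled input would be passed to an OPF solver to obtain a new labeled instance, the proxy would be retrained, and the scores recomputed until the time limit is reached.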
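Accurate segmentation and motion estimation of the myocardium have always been important in the clinical field and contribute substantially to downstream diagnosis. However, existing methods cannot always guarantee shape integrity in myocardium segmentation. In addition, motion estimation requires point correspondence on the myocardium region across different frames. In this paper, we propose a novel end-to-end deep statistical shape model that focuses on myocardium segmentation while preserving both shape integrity and boundary correspondence. Specifically, myocardium shapes are represented by a fixed number of points, whose variations are extracted by principal component analysis (PCA). A deep neural network is used to predict the transformation parameters (both affine and deformation), which are then used to warp the mean point cloud into the image domain. Furthermore, a differentiable rendering layer is introduced to incorporate mask supervision into the framework so that more accurate point clouds can be learned. In this way, the proposed method consistently produces anatomically plausible segmentation masks without post-processing. In addition, the predicted point cloud guarantees boundary correspondence across sequential images, which benefits downstream tasks such as myocardium motion estimation. We conduct several experiments to demonstrate the effectiveness of the proposed method on several benchmark datasets.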
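As a rough illustration of the shape representation described above, a PCA-based statistical shape model is commonly written as a mean shape plus a weighted sum of principal modes, followed by a global transform. The symbols below ($N$, $K$, $\bar{\mathbf{s}}$, $\mathbf{P}$, $\mathbf{b}$, $\mathbf{A}$, $\mathbf{t}$) are notation assumed for this sketch rather than taken from the paper, and treating the predicted deformation as PCA coefficients is only one plausible reading of the abstract.

```latex
% Shape as N two-dimensional points stacked into one vector,
% deformed along the first K principal modes (columns of P).
\mathbf{s}(\mathbf{b}) = \bar{\mathbf{s}} + \mathbf{P}\,\mathbf{b},
\qquad \bar{\mathbf{s}} \in \mathbb{R}^{2N},\;
\mathbf{P} \in \mathbb{R}^{2N \times K},\;
\mathbf{b} \in \mathbb{R}^{K}

% Each deformed point x_i(b) is then placed in the image by an affine map.
\mathbf{x}_i' = \mathbf{A}\,\mathbf{x}_i(\mathbf{b}) + \mathbf{t},
\qquad \mathbf{A} \in \mathbb{R}^{2 \times 2},\;
\mathbf{t} \in \mathbb{R}^{2},\; i = 1, \dots, N
```

Because every predicted shape uses the same $N$ ordered points, point $i$ in one frame corresponds to point $i$ in the next, which is how a model of this form can provide boundary correspondence across sequential images.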